Unidata Outreach Accomplishments and Challenges

Ben Domenico, September 2012

Relationship to Unidata 2013 Proposal

This work relates to several of the proposal goals: 1. Broadening participation and expanding community services; 2. Advancing data services; 3. Developing and deploying useful tools; 5. Providing leadership in cyberinfrastructure.

As noted in the following two sections, the work was called out specifically in an interaction with the review panel and in the review panel summary.

Review panel question and UPC response

1e. Is the UPC prepared to provide the same quality of support to the newly engaged communities as it provides to its current constituents?

While the support for all users will remain at a very high level, that does not mean it will be exactly the same. For example, for the core community Unidata provides comprehensive support for a full suite of tools from data services, through decoders, to complete analysis and display packages. For other communities, the tools specialized to their needs may not be available from or supported by the UPC. One example of this is the community of users of GIS tools. In that case Unidata supports standards-based web services that make our datasets available in such a way that tools incorporating those standard interfaces can use Unidata datasets. Thus these new communities can continue to use the analysis and display tools they are familiar with while taking advantage of the data services of the traditional Unidata community.

Excerpt from the proposal review panel report

Advocacy for Community Standards:  "In particular, the UPC could play a significant leadership role within committees and consortiums like OGC seeking to address the need to develop standards and technologies for data discovery. Unidata leadership and advocacy in this area could facilitate expanded utilization of Unidata information resources for other research areas like climate and provide Unidata users with easier access to other data sources like NASA satellite information. However, the OGC letter of recommendation in the proposal and the Unidata responses to the review panel questions regarding cyberinfrastructure did demonstrate that the Unidata was actively involved in community discussion of interface and data standards."

Relationship to Current Unidata Strategic Plan

Below are a few excerpts from the current Unidata Strategic Plan that highlight the importance of the outreach activities summarized in this status update:

  • ... to build infrastructure that makes it easy to integrate and use data from disparate geoscience disciplines

  • Data formats like netCDF, together with community-based data standards like the Climate and Forecast metadata convention and the Common Data Model are enhancing the widespread usability and interoperability of scientific datasets.

  • Advance geoscience data and metadata standards and conventions

  • ... our experience shows us that robust solutions arise from community and collaborative efforts

  • ... close partnerships and collaboration with geoscience data providers, tool developers, and other stakeholders, and the informed guidance of our governing committees will all be important catalysts for Unidata’s success.

Summary of Recent Progress

Background on netCDF and CF formal standards efforts

Following on the success of Russ Rew and the netCDF team in establishing netCDF and CF as NASA standards, efforts continue to have CF-netCDF recognized internationally by the Open Geospatial Consortium (OGC) as standards for encoding georeferenced data in binary form.

As the official UCAR representative to the OGC Technical Committee, Unidata participates in 3-4 technical committee meetings per year to ensure that Unidata and UCAR needs are met in the emerging international standards.

The overall plan and status are maintained at http://sites.google.com/site/galeonteam/Home/plan-for-cf-netcdf-encoding-standard. In keeping with the proposal and review panel recommendations, the goal of this effort is to encourage broader use of Unidata's data by fostering greater interoperability among clients and servers interchanging data in binary form. Establishing CF-netCDF as an OGC standard for binary encoding will make it possible to incorporate standard delivery of data in binary form via several OGC protocols, e.g., Web Coverage Service (WCS), Web Feature Service (WFS), and Sensor Observation Service (SOS). For over a year, the OGC WCS SWG has been developing an extension to the core WCS for delivery of data encoded in CF-netCDF. This independent CF-netCDF standards effort is complementary to the one in WCS and, we hope, will facilitate similar extensions for other standard protocols.
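
To make the idea of standard delivery more concrete, here is a minimal sketch of how a client might build a WCS GetCoverage request that asks for netCDF output. It assumes the WCS 2.0 key-value-pair request style. The endpoint URL, the coverage identifier, and the "application/x-netcdf" format string are illustrative assumptions, not values taken from any particular server or from the draft extension.

```python
from urllib.parse import urlencode


def build_getcoverage_url(endpoint, coverage_id, fmt="application/x-netcdf"):
    """Return a WCS 2.0-style key-value-pair GetCoverage URL (sketch only)."""
    params = {
        "service": "WCS",
        "version": "2.0.1",
        "request": "GetCoverage",
        "coverageId": coverage_id,
        # Assumed format identifier; a real server advertises its
        # supported formats in its capabilities document.
        "format": fmt,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and coverage name, for illustration only.
url = build_getcoverage_url("http://example.org/wcs", "air_temperature")
```

The point of the sketch is that the client only changes the requested format; the rest of the request is the same as for any other WCS encoding.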

Progress on OGC standardization

In 2011, the netCDF Classic data model was established as the OGC core netCDF standard.   The binary encoding for the classic data model was established as the first extension to the netCDF core standard.   At this time the netCDF enhanced data model and the CF (Climate and Forecast) conventions have been proposed as extensions to the core.    The OGC-adopted standards documents are available at

http://www.opengeospatial.org/standards/netcdf
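
For readers less familiar with what the core standard covers, the following is a rough sketch of the netCDF classic data model (dimensions, variables, and attributes) expressed as plain Python classes. The class and method names are hypothetical and are not part of any netCDF library; the one rule enforced here, at most one unlimited dimension, is a property of the classic model. The CF attributes in the usage lines (Conventions, standard_name, units) show where the CF conventions layer on top of the data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Dimension:
    name: str
    size: Optional[int]  # None marks the unlimited (record) dimension


@dataclass
class Variable:
    name: str
    dtype: str
    dims: Tuple[str, ...]
    attrs: Dict[str, object] = field(default_factory=dict)


@dataclass
class Dataset:
    dims: Dict[str, Dimension] = field(default_factory=dict)
    variables: Dict[str, Variable] = field(default_factory=dict)
    attrs: Dict[str, object] = field(default_factory=dict)

    def add_dimension(self, name, size=None):
        # The classic model permits at most one unlimited dimension.
        if size is None and any(d.size is None for d in self.dims.values()):
            raise ValueError("classic model allows only one unlimited dimension")
        self.dims[name] = Dimension(name, size)

    def add_variable(self, name, dtype, dims, **attrs):
        # Every dimension a variable uses must already be defined.
        missing = [d for d in dims if d not in self.dims]
        if missing:
            raise ValueError(f"undefined dimensions: {missing}")
        self.variables[name] = Variable(name, dtype, tuple(dims), attrs)


# Illustrative dataset with CF-style attributes (names chosen for the example).
ds = Dataset(attrs={"Conventions": "CF-1.6"})
ds.add_dimension("time")        # unlimited
ds.add_dimension("lat", 73)
ds.add_dimension("lon", 144)
ds.add_variable("tas", "float", ("time", "lat", "lon"),
                standard_name="air_temperature", units="K")
```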

Two additional proposed standards are in the pipeline for OGC standardization. The extension for CF conventions has gone out for public comment, the last step before voting by the full OGC Technical Committee. The Enhanced (netCDF4) Data Model extension is being voted on by the TC. Just on the day of this writing, we received two more YES votes, which put us over the number needed for adoption among the 98-member committee.

Ongoing Outreach Activities

EarthCube Activities

Over the last year, much of Unidata's outreach activity has been focused on the NSF EarthCube initiative. After participating in elaborate preparations for the EarthCube charrette last November, including involvement in several whitepapers, Unidata participated in the November charrette (with IEEE covering travel) and the follow-on charrette in June. Now we are involved in several community groups and concept awards. For some of us, the long-term vision for EarthCube is still somewhat unclear, although several valuable ideas and collaborative relationships are forming. It will be good for Unidata to be aware of those and to be part of the ones that make sense.

One of those relationships has resulted in our being asked to participate in a Data Infrastructure Building Blocks (DIBBs) proposal led by the San Diego Supercomputer Center (SDSC). If the proposal is funded, part of the project will involve using information from our support email database to help annotate our datasets to support use by our own community and others. While the resulting system could be of benefit to the Unidata community, our commitment to the effort will be minimal: providing access to our support email database (with appropriate privacy restrictions) and a liaison to the SDSC group doing the work.


Interactions continue in two "concept award" areas: Brokering and Cross-domain Interoperability. Our work in these projects is concentrated on making our data more discoverable and accessible. One key element is to work with a third, middle tier between clients and servers in the web services architecture. This layer will make metadata from THREDDS Data Servers more readily available to a variety of data discovery systems and will make the datasets themselves more conveniently accessible via many protocols not supported in the TDS itself. We are working with groups developing software for this brokering tier. One such product is the open source ESRI GeoPortal, which has been the primary target of a recent "hackathon" in which several TDS servers were involved. Another is a joint effort with the University of Florence ESSI Lab to experiment with their web services brokering layer tools to determine whether we can simplify our web client and server tools by using a brokering layer to do transformations among metadata and data service protocols and encodings.
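
As a rough illustration of the brokering idea, the sketch below shows a middle tier that takes one metadata record and re-expresses it for different kinds of clients. Everything here is hypothetical: the Broker class, the record fields, and the two target formats are invented for the example and do not represent the ESRI GeoPortal, the ESSI Lab tools, or the TDS catalog schema.

```python
class Broker:
    """Middle tier that re-expresses one metadata record for several clients."""

    def __init__(self):
        self._translators = {}

    def register(self, target, translator):
        # Each target format gets one function that converts a source record.
        self._translators[target] = translator

    def translate(self, record, target):
        if target not in self._translators:
            raise KeyError(f"no translator registered for {target!r}")
        return self._translators[target](record)


def to_discovery_summary(record):
    # Shape a record the way a catalog-style discovery client might want it.
    return {
        "title": record["name"],
        "keywords": sorted(record["variables"]),
        "link": record["access_url"],
    }


def to_bbox_text(record):
    # Shape the same record for a client that cares about spatial extent.
    west, south, east, north = record["bbox"]
    return f"{record['name']}: lon {west}..{east}, lat {south}..{north}"


broker = Broker()
broker.register("discovery", to_discovery_summary)
broker.register("bbox", to_bbox_text)

# Hypothetical record, standing in for metadata harvested from a data server.
record = {
    "name": "Example Model Run",
    "variables": ["wind_speed", "air_temperature"],
    "access_url": "http://example.org/data/run1.nc",
    "bbox": (-130, 20, -60, 55),
}
summary = broker.translate(record, "discovery")
extent = broker.translate(record, "bbox")
```

The design point is that neither the server nor the clients need to know about each other's formats; adding support for a new client means registering one more translator in the broker.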

OGC Standards Actions

  • Enhanced (netCDF4) data model voted in, press release announcement being prepared
  • CF conventions vote about to start
  • CF-netCDF encoding for Web Coverage Service has been drafted
  • Discussion Paper published on Uncertainty Conventions for netCDF
  • OPeNDAP access protocol needs to be coordinated
  • HDF encoding needs to be coordinated

New and Ongoing Collaborations:

  • NCAR GIS Program
  • Marine Metadata Interoperability (MMI) Project Steering Team
  • CUAHSI Standing Committee
  • UCAR-wide representative to OGC Technical Committee
  • AGU ESSI Focus Group Board
  • ESIN Journal Editorial Board
  • Liaison to OOI Cyberinfrastructure Project
  • Several collaborations with EarthCube teams
  • Collaborative European / US effort on the Ocean Data Interoperability Platform
  • Potential collaboration with SDSC team on annotating datasets with information gained from support archives
  • Experimentation with ESSI Lab on use of brokering tier web services technology

At the time of the last meeting, we were disappointed to learn that the ODIP (Ocean Data Interoperability Platform) was not funded by the European Commission. Since that time, ODIP has been chosen for funding, so this will be a new collaborative initiative. We are now working with the San Diego Supercomputer Center and Woods Hole to get the US part of the project funded by NSF. Unidata's technologies (especially THREDDS and netCDF) are part of the project, and we also maintain a liaison role to make our community aware of the work and possible applications.

http://seadatanet.maris2.nl/newsletter.asp#70

Planned Activities

The next steps in the CF-netCDF effort are to complete the standardization process for the Enhanced Data Model and the CF extension. With these as standards, work will resume on the Web Coverage Service (WCS) extension for CF-netCDF encoding. In addition, a discussion paper on netCDF conventions for encapsulating uncertainty information has been approved. We await the outcome of the discussion to determine whether this will eventually be proposed as an additional extension to the netCDF core standard. Work is likely to accelerate on collaborations with OPeNDAP and HDF, both of which are now active in the OGC. An approach for dealing with the HDF5 encoding of the netCDF enhanced data model is still being sought.

I created a white paper based on my "Data Interactive Publications" presentation at an earlier policy committee meeting, which seemed to be well received. It's available at

https://sites.google.com/site/datainteractivepublications/home/white-paper-on-data-interactive-publications

Considerable support for this concept developed at the charrette, and the concept was moved forward by a team led by Tanu Malik of the University of Chicago. However, it was not among the Expressions of Interest encouraged to submit an EAGER proposal. The group is considering publishing an article based on the EarthCube whitepaper and subsequent work in an online journal.

A follow-up presentation on this topic was given as a keynote at the triennial Unidata User Workshop.

Relevant Metrics

  • One more netCDF-related OGC international standard (netCDF 4 data model)
  • One more netCDF-related OGC international standard well into the pipeline (CF conventions)
  • The list of collaborations above includes a dozen organizations we have regular interactions with.  In most cases, our interactions are as representatives of our community on their steering or policy groups, so we have at least some voice in their direction.
  • Over the years of these standardization efforts, ESRI has incorporated netCDF among the input and output formats that their ArcGIS tools work with directly.  This represents a user community that numbers in the millions, but it isn't possible for us to measure how many of those users now use it to access our data.
  • The standards efforts enable us to collaborate on an ongoing basis with dozens of international organizations -- especially those represented in the OGC MetOceans, Earth System Science, and Hydrology Domain Working Groups.